Bayesian Model Averaging
Input Adaptive Bayesian Model Averaging
Slavutsky, Yuli, Salazar, Sebastian, Blei, David M.
This paper studies prediction with multiple candidate models, where the goal is to combine their outputs. This task is especially challenging in heterogeneous settings, where different models may be better suited to different inputs. We propose Input Adaptive Bayesian Model Averaging (IA-BMA), a Bayesian method that assigns model weights conditional on the input. IA-BMA employs an input-adaptive prior and yields a posterior distribution that adapts to each prediction, which we estimate with amortized variational inference. We derive formal guarantees for its performance relative to any single predictor selected per input. We evaluate IA-BMA across regression and classification tasks, studying data from personalized cancer treatment, credit-card fraud detection, and UCI datasets. IA-BMA consistently delivers more accurate and better-calibrated predictions than both non-adaptive baselines and existing adaptive methods. Many applications require adaptive predictions. In personalized medicine, different patients respond differently to the same treatment (Mahajan et al., 2023); in fairness-sensitive domains, predictions need to adapt to subpopulations (Wang et al., 2019; Grother et al., 2019); and in fraud detection, behavioral data is often heteroskedastic and varies substantially across inputs (Varmedja et al., 2019).
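The core idea of input-dependent weighting can be sketched in a few lines. This is a minimal illustration, not the paper's method: the `gate` function below is a hypothetical stand-in for IA-BMA's amortized posterior over model weights, and the two toy predictors are invented for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                           # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def input_adaptive_average(x, predictors, gate):
    """Combine candidate predictors with weights that depend on the input.

    predictors: list of functions x -> prediction
    gate: function x -> unnormalized score per predictor (a stand-in
          for an amortized posterior over per-input model weights)
    """
    w = softmax(gate(x))                      # weights conditional on x
    preds = np.array([f(x) for f in predictors])
    return float(w @ preds)

# Toy example: one model is trusted near the origin, the other far from it.
f1 = lambda x: 2.0 * x
f2 = lambda x: x ** 2
gate = lambda x: np.array([-abs(x), abs(x) - 3.0])

print(input_adaptive_average(1.0, [f1, f2], gate))
```

At `x = 1.0` the gate favors `f1`, so the combined prediction sits closer to `f1`'s output than `f2`'s; a non-adaptive average would use the same weights everywhere.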
Beyond Bayesian Model Averaging over Paths in Probabilistic Programs with Stochastic Support
Reichelt, Tim, Ong, Luke, Rainforth, Tom
The posterior in probabilistic programs with stochastic support decomposes as a weighted sum of the local posterior distributions associated with each possible program path. We show that making predictions with this full posterior implicitly performs a Bayesian model averaging (BMA) over paths. This is potentially problematic, as model misspecification can cause the BMA weights to prematurely collapse onto a single path, in turn leading to sub-optimal predictions. To remedy this issue, we propose alternative mechanisms for path weighting: one based on stacking and one based on ideas from PAC-Bayes. We show how both can be implemented as a cheap post-processing step on top of existing inference engines. In our experiments, we find that they are more robust and lead to better predictions than the default BMA weights.
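A generic stacking step of the kind the abstract describes can be sketched as follows. This is an assumed, simplified formulation (not the paper's implementation): given each path posterior's predictive density at held-out points, find simplex weights that maximize the held-out log score of the mixture.

```python
import numpy as np

def stacking_weights(dens, steps=2000, lr=0.1):
    """Find simplex weights over program paths maximizing the held-out
    log score of the mixture, as an alternative to default BMA weights.

    dens: array (n_points, n_paths); dens[i, k] is the predictive density
          of path k's local posterior at held-out point i.
    """
    n, k = dens.shape
    theta = np.zeros(k)                         # softmax parameterization
    for _ in range(steps):
        w = np.exp(theta) / np.exp(theta).sum()
        mix = dens @ w                          # mixture density per point
        grad_w = (dens / mix[:, None]).sum(0) / n
        theta += lr * w * (grad_w - w @ grad_w)  # chain rule through softmax
    return np.exp(theta) / np.exp(theta).sum()

# Path 1 explains the held-out data far better than path 2, so stacking
# should put almost all weight on it.
rng = np.random.default_rng(0)
dens = np.column_stack([rng.uniform(0.8, 1.0, 50), rng.uniform(0.0, 0.1, 50)])
print(stacking_weights(dens))
```

Because the objective is evaluated on held-out data, a misspecified path that merely fits the training data cannot keep the weights collapsed onto itself, which is the robustness property the abstract claims for stacking.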
Automating Model Comparison in Factor Graphs
van Erp, Bart, Nuijten, Wouter W. L., van de Laar, Thijs, de Vries, Bert
The famous aphorism of George Box states: "all models are wrong, but some are useful" [1]. It is the task of statisticians and data analysts to find a model which is most useful for a given problem. The build, compute, critique and repeat cycle [2], also known as Box's loop [3], is an iterative approach for finding the most useful model. Any effort to shorten this design cycle increases the chances of developing more useful models, which in turn might yield more reliable predictions, more profitable returns or more efficient operations for the problem at hand. In this paper we choose to adopt the Bayesian formalism and therefore we will specify all tasks in Box's loop as principled probabilistic inference tasks. In addition to the well-known parameter and state inference tasks, the critique step in the design cycle is also phrased as an inference task, known as Bayesian model comparison, which automatically embodies Occam's razor [4, Ch. 28.1]. Rather than selecting a single model in the critique step, we quantify our confidence about which of the different models is best, especially when data is limited [5, Ch. 18.5.1]. The uncertainty arising from prior beliefs p(m) over a set of models m and limited observations can be naturally included through the use of Bayes' theorem.
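The closing sentence invokes Bayes' theorem over models; in standard notation (a reconstruction of the familiar identity, not quoted from the paper), the posterior over models m given data D is

```latex
p(m \mid \mathcal{D}) \;=\; \frac{p(\mathcal{D} \mid m)\, p(m)}{\sum_{m'} p(\mathcal{D} \mid m')\, p(m')},
```

where the model evidence p(D | m) automatically penalizes needless complexity, which is the sense in which model comparison embodies Occam's razor.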
Target Identification and Bayesian Model Averaging with Probabilistic Hierarchical Factor Probabilities
Target detection in hyperspectral imagery is the process of locating pixels in an image that are likely to contain a target, typically done by comparing one or more spectra for the desired target material to each pixel in the image. Target identification extends detection with an additional process that identifies more specifically the material present in each pixel that scored high in detection. Detection is generally a two-class problem of target vs. background, while identification is a many-class problem including target, background, and additional known materials. The identification process we present is probabilistic and hierarchical, which provides transparency to the process and produces trustworthy output. In this paper we show that target identification has a much lower false alarm rate than detection alone, and we provide a detailed explanation of a robust identification method using probabilistic hierarchical classification that handles the vague, user-dependent categories of materials, which differ from the specific physical categories of chemical constituents. Identification is often done by comparing mixtures of materials that include the target spectra to mixtures of materials that do not, possibly with other steps (band combinations, feature checking, background removal, etc.). Standard linear regression does not handle these problems well because the number of regressors (identification spectra) is greater than the number of feature variables (bands), and there are multiple correlated spectra. Our proposed method handles these challenges efficiently and provides additional important practical information in the form of hierarchical probabilities computed from Bayesian model averaging.
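The roll-up of model probabilities into hierarchical category probabilities can be sketched generically. This is a minimal illustration under assumed inputs (the material names, log evidences, and two-level hierarchy below are invented, not from the paper): each candidate gets a BMA posterior probability from its marginal likelihood, and a category's probability is the sum over its members.

```python
import numpy as np

def hierarchical_probabilities(log_evidence, hierarchy):
    """Roll BMA posterior model probabilities up a category hierarchy.

    log_evidence: dict name -> log marginal likelihood of that candidate
    hierarchy: dict category -> list of member candidate names
    """
    names = list(log_evidence)
    logp = np.array([log_evidence[n] for n in names])
    logp -= logp.max()                        # stabilize the normalization
    p = np.exp(logp) / np.exp(logp).sum()     # posterior model probabilities
    post = dict(zip(names, p))
    return {cat: sum(post[n] for n in members)
            for cat, members in hierarchy.items()}

# Hypothetical candidates: two paint variants and a background material.
log_ev = {"paint_a": -1.0, "paint_b": -1.5, "vegetation": -4.0}
cats = {"paint": ["paint_a", "paint_b"], "background": ["vegetation"]}
print(hierarchical_probabilities(log_ev, cats))
```

The appeal of reporting at the category level is visible in the toy numbers: neither paint variant dominates on its own, but the "paint" category carries nearly all of the posterior mass.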
On the Effectiveness of Mode Exploration in Bayesian Model Averaging for Neural Networks
Holodnak, John T., Wollaber, Allan B.
Multiple techniques for producing calibrated predictive probabilities with deep neural networks in supervised learning settings have emerged that ensemble diverse solutions discovered during cyclic training or training from multiple random starting points (deep ensembles). However, only a limited amount of work has investigated the utility of exploring the local region around each diverse solution (posterior mode). Using three well-known deep architectures on the CIFAR-10 dataset, we evaluate several simple methods for exploring local regions of the weight space with respect to Brier score, accuracy, and expected calibration error. We consider both Bayesian inference techniques (variational inference and Hamiltonian Monte Carlo applied to the softmax output layer) and the use of the stochastic gradient descent trajectory near optima. While adding separate modes to the ensemble uniformly improves performance, we show that the simple mode-exploration methods considered here produce little to no improvement over ensembles without mode exploration.
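The baseline that the paper finds hard to beat, averaging predictions from separately trained modes and scoring the result, can be sketched as follows. The two "modes" below are invented two-class probability tables, not results from the paper; they only illustrate why mode averaging helps when members err on different inputs.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average predictive class probabilities from independently trained
    modes (a deep ensemble); each entry is (n_points, n_classes)."""
    return np.mean(prob_list, axis=0)

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and one-hot labels
    (lower is better)."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

# Two hypothetical modes that are confidently right on different inputs.
labels = np.array([0, 1])
mode1 = np.array([[0.9, 0.1], [0.4, 0.6]])
mode2 = np.array([[0.6, 0.4], [0.1, 0.9]])
ens = ensemble_predict([mode1, mode2])
print(brier_score(ens, labels))   # better than either mode alone
```

Mode exploration would add further samples drawn near each of `mode1` and `mode2` to the averaged list; the paper's finding is that, for the simple exploration schemes tested, those extra samples barely move the score.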
Dangers of Bayesian Model Averaging under Covariate Shift
Izmailov, Pavel, Nicholson, Patrick, Lotfi, Sanae, Wilson, Andrew Gordon
Approximate Bayesian inference for neural networks is considered a robust alternative to standard training, often providing good performance on out-of-distribution data. However, Bayesian neural networks (BNNs) with high-fidelity approximate inference via full-batch Hamiltonian Monte Carlo achieve poor generalization under covariate shift, even underperforming classical estimation. We explain this surprising result, showing how a Bayesian model average can in fact be problematic under covariate shift, particularly in cases where linear dependencies in the input features cause a lack of posterior contraction. We additionally show why the same issue does not affect many approximate inference procedures, or classical maximum a-posteriori (MAP) training. Finally, we propose novel priors that improve the robustness of BNNs to many sources of covariate shift.
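The contraction failure the abstract describes can be reproduced in a linear-Gaussian model, where everything is exact. This is a minimal sketch under assumed settings (unit Gaussian prior, noise variance 0.01), not the paper's experiment: with two perfectly collinear training features, the posterior contracts along the direction the data constrains but stays at the prior along the null direction, so a shifted input that breaks the collinearity sees prior-scale predictive variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
X = np.column_stack([x, x])          # two perfectly collinear features
y = x + 0.1 * rng.normal(size=n)

# Conjugate Bayesian linear regression: prior w ~ N(0, I), noise var 0.01.
noise, prior = 0.01, 1.0
S_inv = X.T @ X / noise + np.eye(2) / prior
S = np.linalg.inv(S_inv)             # posterior covariance over weights
m = S @ X.T @ y / noise              # posterior mean

# An in-distribution point keeps the collinearity; the shifted one breaks it.
x_in = np.array([1.0, 1.0])
x_shift = np.array([1.0, -1.0])      # off the training feature subspace
print(x_in @ S @ x_in)               # tiny: the posterior has contracted here
print(x_shift @ S @ x_shift)         # ~2.0: still at the prior scale
```

Along (1, -1) the likelihood is flat, so the Bayesian model average integrates over weight settings the data never constrained; a MAP solution instead commits to the single prior-preferred point in that direction, which is one way to see why MAP can be less fragile under this kind of shift.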
Bayesian Model Averaging for Data Driven Decision Making when Causality is Partially Known
Papamichalis, Marios, Ray, Abhishek, Bilionis, Ilias, Kannan, Karthik, Krishnamurthy, Rajiv
Probabilistic machine learning models are often insufficient for decisions about interventions because those models find correlations, not causal relationships. If only observational data is available and experimentation is infeasible, the correct approach to studying the impact of an intervention is to invoke Pearl's causality framework. Even that framework assumes that the underlying causal graph is known, which is seldom the case in practice. When the causal structure is not known, one may use off-the-shelf algorithms to find causal dependencies from observational data. However, no existing method also accounts for the decision-maker's prior knowledge when developing the causal structure. The objective of this paper is to develop rational approaches for making decisions from observational data in the presence of causal-graph uncertainty and prior knowledge from the decision-maker. We use ensemble methods like Bayesian Model Averaging (BMA) to infer a set of causal graphs that can represent the data-generation process. We provide decisions by explicitly computing the expected value and risk of potential interventions. We demonstrate our approach by applying it in different example contexts.
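The decision step can be sketched once per-graph effect estimates and a graph posterior are in hand. This is an assumed formulation, not the paper's implementation: the graph names, posterior probabilities, and effect values below are invented, and risk is taken to be the variance of the effect across graphs.

```python
import numpy as np

def bma_intervention(graph_posterior, effects):
    """Average an intervention's effect over candidate causal graphs,
    weighted by their posterior probability, and report the spread
    across graphs as a risk measure.

    graph_posterior: dict graph_name -> posterior probability
    effects: dict graph_name -> estimated effect of the intervention
             under that graph's adjustment
    """
    w = np.array([graph_posterior[g] for g in graph_posterior])
    e = np.array([effects[g] for g in graph_posterior])
    mean = float(w @ e)
    risk = float(w @ (e - mean) ** 2)   # variance of the effect across graphs
    return mean, risk

# Two hypothetical candidate graphs that disagree about confounding by z.
posterior = {"x->y": 0.7, "x<-z->y": 0.3}
effects = {"x->y": 2.0, "x<-z->y": 0.5}
print(bma_intervention(posterior, effects))
```

A decision-maker's prior knowledge enters through `graph_posterior`: graphs the expert considers implausible receive low prior mass and therefore contribute little to both the expected value and the risk.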